Speech Enhancement Using Linear Prediction and DCT Coefficients

Authors

  • Diksha Sharma
  • Rupinder Kaur
Abstract

One implicit assumption of speech enhancement algorithms is that the representation of speech in a transform domain or over a redundant dictionary is sparse, while that of noise is dense. Based on this assumption, clean speech can be recovered by finding the sparse representation. However, some kinds of noise are also sparse in these representations, which degrades enhancement performance. For example, since the coefficients of car interior noise are sparse in the DCT domain, enhancement performance for car interior noise is not as good as for other noisy backgrounds. In addition, some features, such as speech energy and inter-frame correlation, are not sufficiently considered in existing speech enhancement algorithms, which probably hinders further improvement of speech quality. In the proposed method, speech enhancement is cast as an optimization problem in which the linear prediction (LP) residual and the DCT coefficients are combined and adopted as the representation of speech, to ensure that noise is dense in that domain. Other features, including speech energy, noise energy and correlation, are also considered as constraints to improve the quality and intelligibility of the recovered speech. The proposed algorithm adopts the LP residual as one of the sparse representations of speech, considering it feasible and advantageous, as analyzed in the previous section. To make full use of the sparsity of speech, the DCT coefficients are also included as a measurement. The proposed algorithm aims to recover the clean speech, whose LP residual and DCT coefficients are both sparse, by solving an optimization problem under a series of constraints.
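The sparsity idea in the abstract can be illustrated with a short sketch. This is not the authors' algorithm: it only evaluates a combined L1 cost on the LP residual and DCT coefficients of a frame, and it omits the energy and correlation constraints and the optimization itself. The function names, the LP order of 10, the weight `lam`, and the synthetic test frames are all assumptions made for illustration.

```python
import numpy as np

def lp_coefficients(frame, order=10):
    """Autocorrelation-method LP coefficients a[1..p], with x[n] ~ sum_k a[k] x[n-k]."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]  # lags 0..order
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)  # small ridge for numerical stability
    return np.linalg.solve(R, r[1 : order + 1])

def lp_residual(frame, a):
    """Prediction error e[n] = x[n] - sum_k a[k] x[n-k], assuming zero initial state."""
    pred = np.zeros_like(frame)
    for k, ak in enumerate(a, start=1):
        pred[k:] += ak * frame[:-k]
    return frame - pred

def dct_ortho(frame):
    """Orthonormal DCT-II computed directly from its definition."""
    n = len(frame)
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    return scale * (basis @ frame)

def l1_l2_ratio(v):
    """Sparsity proxy: a smaller value means energy sits in fewer coefficients."""
    return np.sum(np.abs(v)) / (np.linalg.norm(v) + 1e-12)

def joint_sparsity_cost(frame, order=10, lam=1.0):
    """Hypothetical combined cost: L1 of LP residual plus lam times L1 of DCT coefficients."""
    x = frame / (np.linalg.norm(frame) + 1e-12)  # unit energy, so frames are comparable
    e = lp_residual(x, lp_coefficients(x, order))
    return np.sum(np.abs(e)) + lam * np.sum(np.abs(dct_ortho(x)))

# Synthetic "voiced" frame: an impulse train driving a two-pole resonator (assumed values).
n = 320
excitation = np.zeros(n)
excitation[::80] = 1.0
voiced = np.zeros(n)
for i in range(n):
    voiced[i] = excitation[i]
    if i >= 1:
        voiced[i] += 1.6 * voiced[i - 1]
    if i >= 2:
        voiced[i] -= 0.9 * voiced[i - 2]

# White Gaussian noise frame for comparison (dense in both domains).
noise = np.random.default_rng(0).standard_normal(n)

print("voiced cost:", joint_sparsity_cost(voiced))
print("noise cost: ", joint_sparsity_cost(noise))
```

In this toy setting the speech-like frame yields the lower cost, because its LP residual is close to the sparse excitation and its DCT energy concentrates near the resonance. Noise that is itself sparse in the DCT domain, such as the car interior noise mentioned in the abstract, would score differently, which is the motivation for combining the two representations.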

Similar resources

Speech enhancement based on hidden Markov model using sparse code shrinkage

This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian combination (for clean speech and noise, respectively) in the HMM ...

Analysis of Correlation between Audio and Audio Feature Predic

The aim of this work is to examine the correlation between audio and visual speech features. The motivation is to find visual features that can provide clean audio feature estimates which can be used for speech enhancement when the original audio signal is corrupted by noise. Two audio features (MFCCs and formants) and three visual features (active appearance model, 2-D DCT and cross-DCT) are c...

Estimation of LPC coefficients using Evolutionary Algorithms

The vast use of Linear Prediction Coefficients (LPC) in speech processing systems has intensified the importance of their accurate computation. This paper is concerned with computing LPC coefficients using evolutionary algorithms: Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Differential Evolution (DE) and Particle Swarm Optimization with Differentially perturbed Velocity (PSO-DV...

A brief survey of a DCT-Based Speech Enhancement System

The Discrete Cosine Transform (DCT) performs similarly to the Karhunen-Loeve Transform (KLT) and has properties similar to those of the Discrete Fourier Transform (DFT). It is advantageous for speech enhancement as it provides better energy compaction capability. Even for a perfectly stationary signal, frame-to-frame variations of the DCT coefficients are observed. In pitch synchronous analysis DCT based spe...

Modification of pitch using DCT in the source domain

In this paper, we propose a novel algorithm for pitch modification. The linear prediction residual is obtained from pitch synchronous frames by inverse filtering the speech signal. Then the Discrete Cosine Transform (DCT) of these residual frames is taken. Based on the desired factor of pitch modification, the dimension of the DCT coefficients of the residual is modified by truncating or zero p...

Publication date: 2014